Unambiguity of Extended Regular Expressions in SGML Document Grammars

Author

  • Anne Brüggemann-Klein
Abstract

In the Standard Generalized Markup Language (SGML), document types are defined by context-free grammars in an extended Backus-Naur form. The right-hand side of a production is called a content model. Content models are extended regular expressions that have to be unambiguous in the sense that "an element ... that occurs in the document instance must be able to satisfy only one primitive content token without looking ahead in the document instance." In this paper, we present a linear-time algorithm that decides whether a given content model is unambiguous. A similar result has previously been obtained not for content models but for the smaller class of standard regular expressions. It relies on the fact that the languages of marked regular expressions are local, a property that does not hold any more for content models that contain the new &-operator. Therefore, it is necessary to develop new techniques for content models. Besides solving an interesting problem in formal language theory, our results are relevant for developers of SGML systems. In fact, our definitions are causing changes to the revised edition of the SGML standard, and the algorithm to test content models for unambiguity has been implemented in an SGML parser.
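To make the notion of unambiguity concrete, here is a minimal Python sketch for standard regular expressions only. It is not the linear-time algorithm of the paper and does not handle the &-operator. It marks every symbol occurrence with a distinct position, computes the usual first and follow sets, and reports ambiguity when two competing positions carry the same element name. The tuple-based expression format and the name `is_unambiguous` are assumptions made for this illustration, and the check can take quadratic time in the worst case.

```python
from itertools import count


def is_unambiguous(expr):
    """Check one-symbol determinism of a standard regular expression.

    Hypothetical input format (an assumption of this sketch):
      ("sym", name), ("cat", e1, e2), ("alt", e1, e2),
      ("star", e), ("plus", e), ("opt", e)
    """
    symbol_of = {}   # position -> element name
    follow = {}      # position -> positions that may come directly after it
    fresh = count()  # generator of distinct position marks

    def visit(node):
        """Return (nullable, first, last) for a sub-expression."""
        kind = node[0]
        if kind == "sym":
            p = next(fresh)
            symbol_of[p] = node[1]
            follow[p] = set()
            return False, {p}, {p}
        if kind == "alt":
            n1, f1, l1 = visit(node[1])
            n2, f2, l2 = visit(node[2])
            return n1 or n2, f1 | f2, l1 | l2
        if kind == "cat":
            n1, f1, l1 = visit(node[1])
            n2, f2, l2 = visit(node[2])
            for p in l1:
                follow[p] |= f2
            first = (f1 | f2) if n1 else f1
            last = (l1 | l2) if n2 else l2
            return n1 and n2, first, last
        if kind in ("star", "plus", "opt"):
            n, f, l = visit(node[1])
            if kind != "opt":
                # Iteration: the end of one pass may be followed by a new pass.
                for p in l:
                    follow[p] |= f
            return (n or kind != "plus"), f, l
        raise ValueError(f"unknown node kind: {kind!r}")

    _, first, _ = visit(expr)

    def clash(positions):
        """True if two distinct positions share the same element name."""
        names = [symbol_of[p] for p in positions]
        return len(names) != len(set(names))

    return not clash(first) and not any(clash(s) for s in follow.values())


a, b, c = ("sym", "a"), ("sym", "b"), ("sym", "c")
ambiguous = ("alt", ("cat", a, b), ("cat", a, c))   # ((a, b) | (a, c))
unambiguous = ("cat", a, ("alt", b, c))             # (a, (b | c))
print(is_unambiguous(ambiguous), is_unambiguous(unambiguous))
```

Both example expressions describe the same language, but only the second lets a parser match each incoming element to a single token without lookahead, which is the property the quoted SGML requirement asks for.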

Similar resources

Extensions of Attribute Grammars for Structured Document Queries

Widely-used document specification languages like, e.g., SGML and XML, model documents using extended context-free grammars. These differ from standard context-free grammars in that they allow arbitrary regular expressions on the right-hand side of productions. To query such documents, we introduce a new form of attribute grammars (extended AGs) that work directly over extended context-free gramm...

The Influence of Lookahead in Competitive Paging Algorithms (Extended Abstract)

Efficient Self Simulation Algorithms for Reconfigurable Arrays p. 25 Optimal Upward Planarity Testing of Single-Source Digraphs p. 37 On Bufferless Routing of Variable-Length Messages in Leveled Networks p. 49 Saving Comparisons in the Crochemore-Perrin String Matching Algorithm p. 61 Unambiguity of Extended Regular Expressions in SGML Document Grammars p. 73 On the Direct Sum Conjecture in the...

Standard Generalized Markup Language: Mathematical and Philosophical Issues

The Standard Generalized Markup Language (SGML), an ISO standard, has become the accepted method of defining markup conventions for text files. SGML is a metalanguage for defining grammars for textual markup in much the same way that Backus-Naur Form is a metalanguage for defining programming-language grammars. Indeed, HTML, the method of marking up hypertext documents for the World Wide Web, is a...

Regular Expressions into Finite Automata

It is a well-established fact that each regular expression can be transformed into a non-deterministic finite automaton (NFA) with or without ε-transitions, and all authors seem to provide their own variant of the construction. Of these, Berry and Sethi [4] have shown that the construction of an ε-free NFA due to Glushkov [10] is a natural representation of the regular expression, because it can be des...

SGML and XML Document Grammars and Exceptions

The Standard Generalized Markup Language (SGML) and the Extensible Markup Language (XML) allow users to define document type definitions (DTDs), which are essentially extended context-free grammars expressed in a notation that is similar to extended Backus-Naur form. The right-hand side of a production, called a content model, is both an extended and a restricted regular expression. The semantics...

Publication date: 1993